Abstract



We introduce a new class of infrared safe jet observables, which we refer to as template 
overlaps, designed to filter targeted highly boosted particle decays from QCD jets and other 
background. Template overlaps are functional measures that quantify how well the energy 
flow of a physical jet matches the flow of a boosted partonic decay. Any region of the partonic 
phase space for the boosted decays defines a template. We will refer to the maximum 
functional overlap found this way as the template overlap. To illustrate the method, we test 
lowest-order templates designed to distinguish highly-boosted top and Higgs decays from 
backgrounds produced by event generators. For the functional overlap, we find good results 
with a simple construction based on a Gaussian in energy differences within angular regions 
surrounding the template partons. Although different event generators give different averages 
for our template overlaps, we find in each case excellent rejection power, especially when 
combined with cuts based on jet shapes. The template overlaps are capable of systematic 
improvement by including higher order corrections in the template phase space. 



1 Introduction 



At the Large Hadron Collider, QCD will produce hadronic final states of unprecedented 
complexity, and most searches for beyond-standard model physics will have to contend with 
large backgrounds. Over the past few years, scenarios have been proposed in which heavy 
particles, including the Higgs and top quark, are produced at large transverse momentum 
[1-11]. At high enough $p_T$, their decay products will appear as heavy, collimated jets [12,
13]. Even such exotic final states, however, will coexist with a substantial tail of the mass
distribution of light-parton QCD jets [14, 15], and it will generally be necessary to study jet
substructure systematically to distinguish such a signal. 

A number of methods to analyze high-$p_T$ jets have been proposed and tested (so far)
against the outputs of event generators. Generally, these methods depend on differences in
the substructure of light-parton QCD jets compared to those from particle decays. Diagnostics
to detect this difference include infrared safe event shapes [16, 17], and direct analyses
of jet substructure [5, 18-23]. To this crowded field we add a new method, based on a
direct quantitative comparison of the energy flow of observed jets at high $p_T$ with the flow
from specific partonic decay modes of boosted heavy particles. Especially when combined 
with event shape information, the analysis of energy flow provides a potentially powerful 
tool. 

Before going into details, we note that energy flow is a natural language for the description 
of jet structure. Jet cross sections are naturally described in terms of correlation functions 
of energy flow [24], which can be interpreted as correlations of the energy-momentum tensor
on the sphere "at infinity" [25-28]. For QCD, these correlations tend to be strongly peaked,
of course, around jets that may represent the scattering or production of the partons of QCD 
or the decays of short-lived resonances reflecting new dynamics. 

It is interesting to draw a contrast between QCD and the analogous problem for the 
cosmic background temperature, where the power distribution is very smooth. Indeed, mo- 
tivated by observation as well as the inflation paradigm one expects for this case a nearly 
scale invariant, almost featureless, differential power spectrum. Hence, the CMB power spec- 
trum, as well as the microscopic physics of the primordial epoch of inflation, is conveniently 
described by two and three point correlation functions of the power spectrum in momentum 
space [22]. Similarly, in case of conformal dynamics, the energy distribution resulting from 
hard scatterings can be well described by energy-energy correlation functions in momentum 
space [28], and again is found to be smoothly distributed, almost spherically symmetric. At 
first sight, energy flow in jet events could not be more different. The search for the origin of 
a given jet, however, whether from QCD radiation or from decay, may benefit from taking 
a similar viewpoint, based on the pattern of energy correlations within jets. In this paper, 
we will present a method for such a quantitative study, with the aim of identifying jets that 
correspond to resonance decay. We will refer to this as a "template" method, in which we 
use our knowledge of the signal to design a custom analysis for each resonance, to make use 
of differences in energy flow between signal and background.

We can summarize the template overlap procedure as follows. We denote by $|j\rangle$ the set of
particles or calorimeter towers that make up a jet, identified by some algorithm, and take $|f\rangle$
to represent a set of partonic momenta $p_1 \ldots p_n$ that represent a boosted decay, found by the
same algorithm. We will introduce a functional measure $\mathcal{F}(j, f) \equiv \langle j|f\rangle$ that quantifies how
well the energy flow of $|j\rangle$ matches $|f\rangle$. Any region of partonic phase space for the boosted
decays, $\{f\}$, defines a template. We will often define the template overlap of observed jet $j$
as $Ov(j, f) = \max_{\{f\}} \mathcal{F}(j, f)$, the maximum functional overlap of $j$ to a state $f[j]$ within the
template region. Template overlaps provide us with a tool to match unequivocally arbitrary
final states $j$ to partonic partners $f[j]$ at any given order. Once a "peak template" $f[j]$
is found, we can use it to characterize the energy flow of the state, which gives additional 
information on the likelihood that it is signal or background. 

To make the matching between physical and template states possible, each event is characterized
by some set of particle or calorimeter energies, $E(\theta_i, \phi_j)_{i,j \in R}$, where $E$ is the energy and $\theta$
and $\phi$ represent coordinates internal to a jet with cone or related parameter $R$. In a typical
experimental setup the energy is discretized according to the detector resolution, and each
pair $i, j$ corresponds to a specific cell in the calorimeter. At the LHC experiments [30], for
instance, the electromagnetic calorimeter cell size (in $\eta$ and $\phi$) is of $\mathcal{O}(0.025 \times 0.025)$, and it is of
$\mathcal{O}(0.1 \times 0.1)$ for hadronic calorimeter cells. For each event, the overlap with the template
states is calculated. 

In general, for each state j, the template state f[j] with maximal overlap with j will 
be used to characterize the event j. We therefore adopt the ansatz that a good (if not 
the best) rejection power is obtained when we use the signal distribution itself to construct 
our templates (see e.g. [29]). At lowest order all the information encoded in the events is 
matched uniquely to the lowest order template with maximum overlap. After showering 
and hadronization, this correspondence is diluted, but as we shall see, very meaningful 
correlations remain. 

The application of these ideas is particularly straightforward for top jets. Much of the 
QCD background is characterized by two sub-jets, with very different energy flow from the 
three-parton templates in general. Indeed, for a lowest order partonic QCD jet consisting 
of the original parton plus one soft gluon, there is no template state from top decay that 
matches the energy flow. This gives a fundamental discrimination, to which we can add 
additional information from event shapes. 

Having given a rationale for the template method, in the following section we provide
a general formalism to describe it. In Sec. 3 we apply the method to templates tailored
to a boosted top search. In this case, as noted above, the three-particle structure of the
lowest-order templates gives a clear distinction between signal and background, which we
amplify further by the use of other infrared safe event shapes. Comparisons are carried out
using anti-$k_T$ jet finders for events found from several Monte Carlo (MC) generators. In
each case, we find large background rejection powers based on this analysis, with substantial
efficiencies. 

Highly boosted Higgs decays are discussed in Sec. 4. In this case, the signal and background
are both two-parton states at lowest order (LO). Their template overlap distributions
are slightly different, but here we use another feature of the template method: the uniqueness
of the template state with maximum overlap. This information provides us with an
additional, infrared safe tool, which will enable us to attain significant rejection power even
in this case. We conclude in Sec. 5.



2 Overlap Formalism 

We want our template overlaps to be functionals of energy flow of any specific event (usually
involving jets), which we label $j$, and a model, or template, for the energy flow in a signal,
referred to as $f$. Our templates will be a set of partonic momenta $f = p_1 \ldots p_n$, with

$$\sum_{i=1}^{n} p_i = P, \qquad P^2 = M^2, \tag{1}$$

which we take to represent the decay products of a signal of mass M. For example, the 
lowest-order template for Higgs decay would have n = 2 and for top decay, n = 3. Of course, 
templates with more than the minimum number of particles are possible. To represent the 
sum over this n-particle phase space, we introduce the notation 

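$$\tau_n^{(R)} \equiv \int \prod_{i=1}^{n} \frac{d^3 \vec p_i}{(2\pi)^3\, 2\omega_i}\; \delta^4\Big(P - \sum_{i=1}^{n} p_i\Big)\, \Theta(\{p_i\}, R), \tag{2}$$

where the function $\Theta(\{p_i\}, R)$ limits the phase space integral to some region, $R$, which may
represent a specific cone size, for example.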

We would like to measure how well the energy flow of any given event $j$ matches that
of the signal on the unit sphere, denoted by $\Omega$. We represent the template energy flow as
$dE(f = p_1 \ldots p_n)/d\Omega$. This function is taken at fixed (to start with, lowest) order. Similarly,
we will represent the energy flow of event $j$ as $dE(j)/d\Omega$. This quantity is observed, either in
experiment or the output of an event generator. Schematically, a general overlap functional
$Ov(j, f)$ is represented as

$$Ov(j, f) = \langle j|f\rangle = \mathcal{F}\left[\frac{dE(j)}{d\Omega}, \frac{dE(f)}{d\Omega}\right]. \tag{3}$$

In principle, the choice of the functional $\mathcal{F}$ is arbitrary.



A natural measure of the matching between state $j$ and the template is the weighted
difference of their energy flows integrated over some specific region that includes the template
momenta $p_i$. To quantify this difference, we construct the functional $\mathcal{F}$ using the template
states. We will find it useful to identify the difference in terms of the template configuration
in $n$-particle phase space with the closest match of energy flow to a given state $j$. As
a measure of the matching we introduce a function that is maximized at $x = 0$, where it is
normalized to unity, which represents a "perfect" match. A simple example, which we will employ
below, is a Gaussian,

$$Ov^{(F)}(j, f) = \max_{\tau_n^{(R)}} \exp\left[-\frac{1}{2\sigma_E^2}\left(\int d\Omega \left[\frac{dE(j)}{d\Omega} - \frac{dE(f)}{d\Omega}\right] F(\Omega, f)\right)^2\right], \tag{4}$$



where we introduce a width, $\sigma_E$, with units of energy. For infrared safety, the function
$F(\Omega, f)$ should be a sufficiently smooth function of the angles for any template state $f$ [31].
For example, it could be defined as a Gaussian around each of the directions of the template
momenta [32]. Alternatively, we may choose $F$ to be a normalized step function that is nonzero
only in definite angular regions around the directions of the template momenta $p_i$ [33]. This
is the method we will use below. We emphasize that the choice of our overlap functional 
is to a large extent arbitrary, subject to the requirements of infrared safety. We will find, 
however, that relatively simple choices can give strong enrichment of signals. 

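To be specific, for an $n$-particle final state, we will represent our template overlap (dropping
the superscript $(F)$) as

$$Ov(j, p_1 \ldots p_n) = \max_{\tau_n^{(R)}} \exp\left[-\sum_{a=1}^{n} \frac{1}{2\sigma_a^2}\left(\int d^2\hat n\, \frac{dE(j)}{d^2\hat n}\, \theta(\hat n, \hat n_a) - E_a\right)^2\right], \tag{5}$$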

where the direction of template particle $a$ is $\hat n_a$ and its energy is $E_a$. In applications below,
we will use these energies to set the widths $\sigma_a$ of the Gaussians. The functions $\theta(\hat n, \hat n_a)$ restrict
the angular integrals to (nonintersecting) regions surrounding each of the template momenta.
We will refer to the state that maximizes the overlap as the "peak template" $f[j]$ for state $j$. The peak
template $f[j]$ provides us with potentially valuable information on energy flow in $j$.

In summary, the output of the peak template method for any physical state $j$ is the value
of the overlap, $Ov(j, f)$, and also the identity of the template state $f[j]$ to which the best
match is found. As we shall see, this will be of particular value when we apply our method 
to boosted Higgs. We turn first, however, to the analysis for boosted tops. 



3 Three-particle Templates and Top Decay 

In this section, we illustrate the peak template method for top identification, using as a 
template the LO partonic three-particle phase space of top decay. The essential observation 
is that light-quark and gluon jets (generally referred to as "QCD jets" below) typically have 
states with a two-subjet topology. Such states generally do not match well with a three-
particle template, and so are easy to separate from the signal on the basis of their low values
of $Ov$ [15, 17]. Of course, some top decay states have low values of $Ov$ also, and some QCD
jets higher values. We will see how to combine the template overlap with planar flow to 
develop a filter that enriches the top signal at relatively high efficiency. 

3.1 Peak template overlap method 

We begin with a detailed description of the peak template overlap method with the LO 
three-parton templates appropriate to top jet analysis. 

3.1.1 Mass cut and discretization of data 

First, we select data using a jet mass window for the top, choosing 160 GeV < $m_J$ <
190 GeV, with top mass chosen to be 174 GeV,^1 cone size $R = 0.5$ ($D = 0.5$ for the anti-$k_T$ jet
algorithm [34] we are using) and jet energy 950 GeV < $P_0$ < 1050 GeV. In our demonstration,
we choose a discretized $\theta$-$\phi$ plane, with $\Delta\theta = 0.06$ and $\Delta\phi = 0.1$. Then, we can build a
table of energy $E(\mathrm{row}_m, \mathrm{column}_n)$, where $\mathrm{row}_m$ and $\mathrm{column}_n$ are the row and column number
corresponding to the discretized $\theta$ and $\phi$.
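As an illustration only, the binning step just described might be coded as below. This is a minimal sketch: the function name `discretize_jet`, the input format (polar angle from the jet axis, azimuth, energy) and the treatment of azimuth modulo $2\pi$ are assumptions made for the example, not a description of the analysis code.

```python
import math

def discretize_jet(particles, d_theta=0.06, d_phi=0.1, cone_r=0.5):
    """Bin jet constituents into an energy table keyed by (row, column).

    particles: iterable of (theta, phi, energy), with theta the polar angle
    from the jet axis and phi the azimuth around it (hypothetical input format).
    """
    table = {}
    for theta, phi, energy in particles:
        if theta >= cone_r:
            continue  # constituent lies outside the jet cone
        row = int(theta // d_theta)                      # polar-angle bin
        column = int((phi % (2.0 * math.pi)) // d_phi)   # azimuthal bin
        table[(row, column)] = table.get((row, column), 0.0) + energy
    return table
```

Storing the table as a sparse dictionary keeps the later nine-cell sums cheap, since most cells of a jet are empty.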

3.1.2 Construction of template states 

We wish to generate a sufficient number of template states to cover three-particle phase space
for top decay, $t \to b + W \to b + q + \bar q$. Imposing the condition $(p_q + p_{\bar q})^2 = M_W^2$, there are
four degrees of freedom. To construct our set of states, we have chosen a brute force method,
based on four angles. We take two of these to be the polar and azimuthal angles that define
the $b$ and $W$ directions in the top rest frame, defined relative to the direction of the boost
from this frame to the lab frame. The remaining two are again polar and azimuthal angles,
that define the $q$ and $\bar q$ directions, this time relative to the boost axis from the $W$ rest frame
to the top rest frame. This method is by no means unique. 

By straightforward Lorentz transformations of particle momenta, the four angles identified
above determine the energies and directions of the three decay products of the top
at LO. We neglect the possible effects of spin and polarization at the particle level in our
construction of template states.^2
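The chain of Lorentz transformations can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code used for the analysis: all partons are taken massless, $P_0$ is interpreted as the top energy, the value `m_w = 80.4` GeV is supplied by us, and the names `boost`, `rotate`, `top_template` and `invariant_mass` are hypothetical.

```python
import math

def boost(p, beta):
    """Boost four-vector p = (E, px, py, pz) by velocity beta = (bx, by, bz)."""
    b2 = sum(b * b for b in beta)
    if b2 == 0.0:
        return p
    gamma = 1.0 / math.sqrt(1.0 - b2)
    bp = sum(b * c for b, c in zip(beta, p[1:]))
    coeff = (gamma - 1.0) * bp / b2 + gamma * p[0]
    return (gamma * (p[0] + bp),) + tuple(c + coeff * b for c, b in zip(p[1:], beta))

def rotate(v, theta, phi):
    """Rotate v so that the z axis is carried to the direction (theta, phi)."""
    x, y, z = v
    x1 = math.cos(theta) * x + math.sin(theta) * z
    z1 = -math.sin(theta) * x + math.cos(theta) * z
    return (math.cos(phi) * x1 - math.sin(phi) * y,
            math.sin(phi) * x1 + math.cos(phi) * y,
            z1)

def top_template(theta_w, phi_w, theta_q, phi_q, p0=1000.0, m_top=174.0, m_w=80.4):
    """Lab-frame four-momenta [b, q, qbar] for one choice of the four angles."""
    # Top rest frame: W and massless b back to back; z is the top boost axis.
    e_w = (m_top**2 + m_w**2) / (2.0 * m_top)
    p_w = (m_top**2 - m_w**2) / (2.0 * m_top)
    n_w = rotate((0.0, 0.0, 1.0), theta_w, phi_w)
    b = (p_w,) + tuple(-p_w * c for c in n_w)
    # W rest frame: massless q and qbar back to back, angles measured from n_w.
    local = (math.sin(theta_q) * math.cos(phi_q),
             math.sin(theta_q) * math.sin(phi_q),
             math.cos(theta_q))
    n_q = rotate(local, theta_w, phi_w)
    q = (0.5 * m_w,) + tuple(0.5 * m_w * c for c in n_q)
    qbar = (0.5 * m_w,) + tuple(-0.5 * m_w * c for c in n_q)
    beta_w = tuple(p_w / e_w * c for c in n_w)                  # W velocity in top frame
    beta_top = (0.0, 0.0, math.sqrt(1.0 - (m_top / p0) ** 2))  # top velocity in lab
    return [boost(b, beta_top),
            boost(boost(q, beta_w), beta_top),
            boost(boost(qbar, beta_w), beta_top)]

def invariant_mass(momenta):
    """Invariant mass of the summed four-momenta."""
    e, px, py, pz = (sum(p[i] for p in momenta) for i in range(4))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
```

A quick consistency check is that the three lab-frame energies add up to `p0` and that the invariant masses of the triplet and of the $q\bar q$ pair reproduce `m_top` and `m_w`.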

^1 We choose this value for the purpose of demonstration only, and the running of the top mass may be
important.

^2 In [17] it was shown that for boosted two-pronged decays, energy flow is very similar for massive spin
zero and spin one.



For this investigation, we discretize all four physical angles with a discretization length of
0.1. As for the discretization of the data, we encode two physical angles in terms of row and
column number corresponding to the data discretization scheme. A given template consists
of a list $(\mathrm{row}_a, \mathrm{column}_a, E_a;\ a = 1, 2, 3)$ for each of three daughter particles of the hadronic top ($b$,
$q$ and $\bar q$). We exclude those templates having particles whose polar angles, $\theta$ relative to the
jet axis, are larger than the cone size R. Also, we impose an energy cut on the templates, 
removing templates that have an energy less than 10 percent of the maximum energy for a 
given particle. The number of template states, constructed as above, that pass these cuts 
is very large, of order three million. We are confident, therefore, that the maximum overlap 
found with this set is very close to the true maximum. We emphasize that, once generated, 
the same set of template states is used for all the data. 
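As a rough cross-check of the size of the template set, one can count the grid points before any cuts. The sketch below assumes that each polar angle runs over $[0, \pi)$ and each azimuthal angle over $[0, 2\pi)$ in steps of 0.1; these ranges and the name `grid_size` are our assumptions.

```python
import math

def grid_size(step=0.1):
    """Number of grid points for two (polar, azimuthal) angle pairs."""
    n_polar = math.ceil(math.pi / step)          # polar angle in [0, pi)
    n_azimuth = math.ceil(2.0 * math.pi / step)  # azimuth in [0, 2*pi)
    return (n_polar * n_azimuth) ** 2

print(grid_size())
```

Under these assumptions the uncut grid has about four million points, which is at least compatible with a set of order three million templates surviving the cone and energy cuts.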

3.1.3 The template overlap 

We next define an overlap between a template, $|f\rangle$, and a specific jet energy configuration, $|j\rangle$:
$Ov(j, f) = \langle j|f\rangle$. Following Eq. (5), we set

$$Ov(j, f) = \max_{\tau_n^{(R)}} \exp\left[-\sum_{a=1}^{3} \frac{1}{2\sigma_a^2}\left(\sum_{k=i_a-1}^{i_a+1}\ \sum_{l=j_a-1}^{j_a+1} E(k, l) - E(i_a, j_a)^{(f)}\right)^2\right], \tag{6}$$

where $E(i_a, j_a)^{(f)}$ is the energy for the template particle $a$, whose direction is labelled by
indices $i_a$ and $j_a$, according to the discretization table described above. For our analysis, we
fix $\sigma_a$ (for the $a$th parton) by that parton's energy,

$$\sigma_a = E(i_a, j_a)^{(f)}/2. \tag{7}$$

In Eq. (6), we define the overlap between data state $j$ and template $f$ on the basis of
an unweighted sum of all the energy in the nine cells of state $j$ surrounding (and
including) each of the three cells populated by a particle in state $f$. If one of the cells is
located on the edge of the cone in the direction of the polar angle with respect to the jet
direction, the number of cells included in the sum is simply taken to be smaller.

3.2 Peak template overlaps for top and QCD jets

We can now apply the peak template function method discussed in the previous sections
to analyze energetic top jet events vis-a-vis QCD jets. We use the data for QCD jet and
hadronic top jet events, for $R = 0.5$, 950 GeV < $P_0$ < 1050 GeV, 160 GeV < $m_J$ < 190
GeV and $m_{top} = 174$ GeV as obtained via the anti-$k_T$ jet clustering algorithm [34] with
the CTEQ6M PDF set [35]. The main purpose of this section is to understand how well we can
discriminate our signal from the potentially overwhelming QCD background by using the
simplest three-point correlation template functions.
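Before turning to the distributions, the discretized overlap of Sec. 3.1.3 can be sketched in a few lines. This is an illustration under assumptions, not the implementation used here: the energy table is a dictionary keyed by (row, column), the width is taken as half the template parton energy, rows outside the cone are dropped, and columns are wrapped in azimuth (the wrapping is our choice). The names `cell_sum`, `overlap` and `peak_template` are hypothetical.

```python
import math

def cell_sum(table, row, column, n_rows, n_columns):
    """Unweighted energy sum over the 3x3 block of cells centred on (row, column)."""
    total = 0.0
    for k in range(row - 1, row + 2):
        if k < 0 or k >= n_rows:
            continue  # block reaches past the cone edge: use fewer cells
        for l in range(column - 1, column + 2):
            total += table.get((k, l % n_columns), 0.0)  # wrap in azimuth (assumed)
    return total

def overlap(table, template, n_rows, n_columns):
    """Gaussian overlap of one template, a list of (row, column, energy)."""
    exponent = 0.0
    for row, column, e_f in template:
        sigma = 0.5 * e_f  # width set by the template parton energy
        diff = cell_sum(table, row, column, n_rows, n_columns) - e_f
        exponent += diff * diff / (2.0 * sigma * sigma)
    return math.exp(-exponent)

def peak_template(table, templates, n_rows, n_columns):
    """Return (maximum overlap, template that attains it)."""
    best = max(templates, key=lambda t: overlap(table, t, n_rows, n_columns))
    return overlap(table, best, n_rows, n_columns), best
```

Because the same template list is reused for every jet, the maximization is a plain scan; the returned template plays the role of $f[j]$.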
In Fig. 1 we compare the overlap distributions for showered top jets and QCD jets (for
the same $z = m_J/P_0$) for event generators Pythia (version 8) [36] for the $2 \to 2$ process without
matching, MadGraph/MadEvent (MG/ME) 6.4 [37] (with MLM matching [38] interfaced
into Pythia V6.4 [39]), and Sherpa 1.2.1 [40, 41]. It is clear that the showering smears
the top distributions significantly, although top events tend to yield somewhat larger peak 
overlaps. Note also the large variations between the generators. 
Figure 1: Comparison of histograms of template overlap $Ov$, Eq. (6), with top jets and QCD
jets from different MCs [upper left (right) Pythia (MG/ME) and Sherpa on the bottom], for
$R = 0.5$, 950 GeV < $P_0$ < 1050 GeV, 160 GeV < $m_J$ < 190 GeV and $m_{top} = 174$ GeV.



3.3 Planar flow 



We have seen that LO top templates already distinguish noticeably between top and QCD 
jets. There is still a close relation, with both distributions being fairly flat. To gain a 
better resolution between the two possibilities, we shall rely on the jet shape variable, planar 
flow [16, 17]. For completeness we give the definition of the planar flow variable, $Pf$. First
construct, for a given jet, a matrix $I_\omega$ as

$$I_\omega^{kl} = \frac{1}{m_J} \sum_i \omega_i\, \frac{p_{i,k}}{\omega_i}\, \frac{p_{i,l}}{\omega_i}, \tag{8}$$

where $m_J$ is the jet mass, $\omega_i$ is the energy of particle $i$ in the jet, and $p_{i,k}$ is the $k$th component
of its transverse momentum relative to the axis of the jet's momentum. The $Pf$ variable is
defined as

$$Pf = \frac{4 \det(I_\omega)}{\mathrm{tr}(I_\omega)^2} = \frac{4\lambda_1\lambda_2}{(\lambda_1 + \lambda_2)^2}, \tag{9}$$

where $\lambda_{1,2}$ are the eigenvalues of $I_\omega$. We shall see that planar flow distinguishes between
many three-jet events with large template overlaps. In general, QCD events with large 
Ov will have significantly smaller planar flow than top decay events. For the QCD jets a 
large overlap would be a result of a kinematic "accident". In the studies we show below, 
the combination of Ov and Pf gives a strong background (QCD) suppression with quite 
substantial signal (top decay) efficiency. 
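For concreteness, planar flow can be evaluated directly from the jet constituents. The sketch below assumes each constituent is supplied as (energy, $p_x$, $p_y$) with the momentum components already taken transverse to the jet axis; the function name `planar_flow` and this input format are illustrative choices.

```python
def planar_flow(constituents, jet_mass):
    """Planar flow from a list of (energy, px_perp, py_perp) constituents."""
    ixx = iyy = ixy = 0.0
    for energy, px, py in constituents:
        # omega_i * (p_k / omega_i) * (p_l / omega_i) = p_k * p_l / omega_i
        ixx += px * px / energy
        iyy += py * py / energy
        ixy += px * py / energy
    ixx, iyy, ixy = ixx / jet_mass, iyy / jet_mass, ixy / jet_mass
    trace = ixx + iyy
    if trace == 0.0:
        return 0.0  # no transverse activity at all
    return 4.0 * (ixx * iyy - ixy * ixy) / (trace * trace)
```

Two constituents that are back to back in the transverse plane give $Pf = 0$ (a linear configuration), while three equal constituents at $120^\circ$ to one another give $Pf = 1$. The jet mass cancels in the ratio and is kept only to mirror the definition.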

In Fig. 2 we test these ideas by plotting the template overlap $Ov$ for the partonic level
output of a MC, versus $Pf$. The figure shows a scatter plot of $Ov$ and $Pf$ found in this way.
The data are all close to unity in $Ov$, but are (as expected) spread out in planar flow. As we
may conclude by looking back at Fig. 1, the effect of showering is to spread out top decays
over the full range of Ov. 
Figure 2: A scatter plot of template overlap, Eq. (6), and $Pf$ for LO parton-level MC output
for top quark decay, with $P_0 = 1$ TeV, $m_{top} = 174$ GeV.



3.4 Application to top decay 

In Fig. 3 we show a comparison of scatter plots of planar flow, $Pf$, vs. template overlap, $Ov$,
for QCD jets (first column) and top jets (second column) from different MCs (from top to bottom:
Pythia, MG/ME, Sherpa), for $R = 0.5$, 950 GeV < $P_0$ < 1050 GeV, 160 GeV < $m_J$ < 190
GeV and $m_{top} = 174$ GeV. The three event generators provide rather different distributions,
but in each case the distinction between the signal and background distributions is evident. 
Clearly, any set of events chosen from the upper right of these plots, with Pf > Ov, is highly 
enriched in top events compared with background. The clear differences in these scatter plots 
show the potential of the template overlap method. 
Figure 3: Comparison of scatter plots of planar flow $Pf$ vs. template overlap $Ov$ for top
jets (right) and QCD jets (left) from different MCs (from top to bottom: Pythia, MG/ME,
Sherpa), for $R = 0.5$, 950 GeV < $P_0$ < 1050 GeV, 160 GeV < $m_J$ < 190 GeV and $m_{top} = 174$
GeV.
Figure 4: Comparison of fake rate vs. efficiency with various cuts on template overlap $Ov$
and $Pf$ with top jets and QCD jets from different MCs [upper left (right) Pythia (MG/ME)
and Sherpa on the bottom], for $R = 0.5$, 950 GeV < $P_0$ < 1050 GeV, 160 GeV < $m_J$ < 190
GeV and $m_{top} = 174$ GeV. The lines show the effect of cuts in planar flow ($Pf$) for fixed
overlap ($Ov$), with the lowest (most inclusive) $Pf$ cuts to the right. The green dot is for
$Pf > 0.6$ and $Ov > 0.4$.
As a simple application of these ideas, in Fig. 4 we show fake rate vs. efficiency with
various cuts on the template overlap $Ov$ as found from Fig. 3. For a given cut on $Ov$, denoted
by a same-colored line, the efficiency is controlled by the upper cut on $Pf$. Each point on
one of these curves corresponds to a specific choice of $Pf$ at fixed $Ov$, and hence to the set of
points within a rectangle that includes the upper right corners of the corresponding scatter
plots in Fig. 3. The results depend on the choice of $Ov$ cut, but it is clear that any cut above
0.2 leads to a substantial increase in efficiency. We present these results for demonstration 
purposes only, and have not carried out a systematic study of how to maximize rejection 
power. 
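The rectangular cuts behind such curves amount to simple counting. The following sketch assumes each event has been reduced to a pair ($Ov$, $Pf$) and that an event passes when both values exceed their thresholds; the names `pass_fraction` and `working_curve` are hypothetical.

```python
def pass_fraction(events, ov_cut, pf_cut):
    """Fraction of (Ov, Pf) pairs with Ov > ov_cut and Pf > pf_cut."""
    if not events:
        return 0.0
    passed = sum(1 for ov, pf in events if ov > ov_cut and pf > pf_cut)
    return passed / len(events)

def working_curve(signal, background, ov_cut, pf_cuts):
    """(efficiency, fake rate) pairs for a fixed Ov cut and a list of Pf cuts."""
    return [(pass_fraction(signal, ov_cut, pf_cut),
             pass_fraction(background, ov_cut, pf_cut))
            for pf_cut in pf_cuts]
```

Applied to the top and QCD samples, `working_curve` would trace one line of fixed $Ov$ cut, with the most inclusive $Pf$ cut giving the largest efficiency.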

Our final results for the top jet case are summarized in Table 1 for the three different event
generators, chosen for the best working point found by these simple, naive one-dimensional
cuts in $Ov$ and $Pf$. It is evident from the numbers presented that the template overlap
method works well for events generated by any of the MC generators. In each case, we find
a large enhancement of signal compared to background, typically of the order of fifteen or
more. Taking into account the rejection of QCD jets by imposing a mass window, these
numbers (for a single massive jet) are multiplied by factors of ten to twenty. The template-
based approach thus yields numbers that compare favorably with those found from other
methods in the literature (see for example Table 9 of Ref. [12]). In addition, it allows for
systematic improvement, for example by incorporating the effect of gluon emission in the
template, or by weighting phase space by squared matrix elements. Because the template
method naturally provides scatter plots like those in Fig. 3, we can imagine optimizing cuts
on the data. We may also investigate improvements in the overlap functional, Eq. (6).
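One way to read these enhancement factors off Table 1 is sketched below. The definitions used here (the ratio of efficiency to fake rate after the mass window, and the change in that ratio once the $Ov$ and $Pf$ cuts are added) are our interpretation for illustration; the numbers are the table entries in percent, and the name `gains` is hypothetical.

```python
# Table 1 entries in percent: (efficiency, fake rate) after the mass cut only,
# then (efficiency, fake rate) after the additional Ov and Pf cuts.
TABLE_1 = {
    "Pythia 8": (58.0, 3.6, 21.0, 0.022),
    "MG/ME": (52.0, 3.7, 11.0, 0.017),
    "Sherpa": (34.0, 3.2, 7.0, 0.032),
}

def gains(eff_mass, fake_mass, eff_cuts, fake_cuts):
    """Return (gain from the Ov and Pf cuts, gain from the mass window alone)."""
    shape_gain = (eff_cuts / eff_mass) / (fake_cuts / fake_mass)
    mass_gain = eff_mass / fake_mass
    return shape_gain, mass_gain

for name, row in TABLE_1.items():
    shape_gain, mass_gain = gains(*row)
    print(f"{name}: Ov+Pf gain {shape_gain:.0f}, mass-window gain {mass_gain:.0f}")
```

With this reading the mass-window gains fall between roughly 11 and 16, in line with the factors of ten to twenty quoted above, while the additional gains from the $Ov$ and $Pf$ cuts range from about 21 (Sherpa) to about 59 (Pythia 8).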

Finally, we note that it is evident both from the scatter plots in Fig. 3 and the efficiency
distributions in Fig. 4 that the different generators tend to yield different energy flow patterns.
In particular, the green dots on each of the three plots are the result of identical cuts
over $Ov$ and $Pf$. This was also noted earlier in the context of the jet mass distribution [15].
This observation should serve as a caution regarding the interpretation of tests for all meth- 
ods, especially those that rely heavily on the anticipated structure of soft radiation in final 
states. 



4 Two-particle Templates and Higgs Decay 

We now apply the template overlap method to boosted Higgs boson decays. The following 
discussion applies as well to electroweak bosons, because spin produces relatively small effects 
in the energy flow [17]. We define the leading order templates in terms of the lowest-order
decays of the Higgs, schematically,

$$|f\rangle = |h\rangle^{(LO)} = |p_1, p_2\rangle. \tag{10}$$
                     Jet mass cut only                        Mass cut + Ov + Pf
MC           Top-jet efficiency [%]   Fake rate [%]    Top-jet efficiency [%]   Fake rate [%]
Pythia 8              58                  3.6                   21                 0.022
MG/ME                 52                  3.7                   11                 0.017
Sherpa                34                  3.2                    7                 0.032

Table 1: Efficiencies and fake rates for jets with $R = 0.5$ (using anti-$k_T$: $D = 0.5$), 950
GeV < $P_0$ < 1050 GeV, 160 GeV < $m_J$ < 190 GeV and $m_{top} = 174$ GeV. The left pair of
columns shows efficiencies and fake rates found by imposing the jet mass window only. The
right pair takes into account the effects of cuts in $Ov$ and $Pf$ in addition to the mass window.
For the different MC simulations, we have imposed various cuts on the $Ov$ and $Pf$ variables:
for Pythia 8, $Ov > 0.6$ and $Pf > 0.4$; for MG/ME, $Ov > 0.7$ and $Pf > 0.39$; and for Sherpa,
$Ov > 0.6$ and $Pf > 0.48$.



As above, our template will be a set of discretized partonic states corresponding to given 
angular configurations. 

The task of disentangling a Higgs signal from a QCD background is actually more challenging
than for the top, because at lowest order both boosted Higgs and QCD jets consist of
two particles. Nevertheless, looking only at the information given from the calorimeter, we
can still obtain the measured energy distribution, $dE(j)/d\Omega$, and compare it to templates
by an overlap function analogous to the one used for the top. At lowest order, signal phase space for the
Higgs is characterized by particularly simple kinematic parameters. For example, in boosted
two-particle Higgs decays, $h \to b\bar b$, we can characterize the final state at fixed $P_0$ by the
angle, $\theta_s$, between the (two-particle) jet axis and the softer of the two particles. At fixed
$z = m_J/P_0 \ll 1$, the distribution in $\theta_s$ is given by a "jet" function [17],

rather strongly peaked for small $\theta_s \gtrsim z$. When $\theta_s$ approaches its minimum value, the decays
are "democratic", sharing the energy of the Higgs nearly evenly between the pair. The
distribution for lowest-order QCD events is still peaked, but much less so [17],

djQ^P oc - (12) 
d6s 6s 

The two-particle phase space parameter $\theta_s$, of course, is not a physical quantity. We can,
however, parameterize the two-particle peak template state, $f[j]$, that a physical state, $j$,
most closely resembles, by matching energy flows, as discussed in Sec. 2. Once we have
identified $f[j]$, we can assign a value of $\theta_s$, or any other kinematic parameter of $f[j]$, to
the corresponding physical state $j$. Template overlaps enable us to make this identification,
and therefore to make selections among data events based on quantifiable criteria. In what
follows, we apply the peak overlap method introduced in Sec. 2 to Higgs decay.
4.1 Higgs templates



We now describe the procedure for applying the peak template overlap method for Higgs, 
with a scheme for discretizing the data. 

4.1.1 Discretization of the data with jet mass and energy selection 

Given a set of data, we impose a jet mass window for the Higgs with a specific cone size R 
and discretize the data with a convenient mass and energy range: for our demonstration we 
choose the jet mass window to be 110 GeV < $m_J$ < 130 GeV, with Higgs mass chosen to be
120 GeV, cone size $R = 0.4$ and jet energy 950 GeV < $P_0$ < 1050 GeV. (For a full analysis,
one can discretize the data with a certain step of energy, say 100 GeV, since jet energy is an
input of our template function.) This gives us a set of final states $j$.

For any state $j$, we determine the measured (or MC generated) energy distribution,
$dE(j)/d\Omega$, in the physical $\theta$-$\phi$ plane with respect to the jet axis for each reconstructed jet,
and we can start discretizing data into a jet-energy configuration. In our demonstration for
the Higgs, we discretize the $\theta$-$\phi$ plane into cells of size $\Delta\theta = 0.04$ and $\Delta\phi = 0.1$. Next, we
again assemble a table of energies $E(\mathrm{row}_m, \mathrm{column}_n)$, where $\mathrm{row}_m$ and $\mathrm{column}_n$ are the row
and column number corresponding to the discretized values of $\theta$ and $\phi$.

4.1.2 Construction of template function 

As for top decay, we generate our templates $f$ from a set of discretized angles. For the
two-body Higgs decay, two angles define the two-body state of the daughter particles. By 
analogy to the top case, we choose these as the polar and azimuthal angles in the Higgs rest 
frame, relative to the boost axis that links the Higgs rest frame with the lab frame. A simple 
Lorentz transformation determines the momenta of the daughter particles in the lab frame, 
where they are compared to data. 
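A minimal sketch of this two-body construction is given below. It assumes massless daughters and interprets $P_0$ as the Higgs energy; the function names `higgs_template` and `within_cone` are illustrative, not taken from the analysis.

```python
import math

def higgs_template(theta_star, phi_star, p0=1000.0, m_h=120.0):
    """Return [(energy, theta, phi), ...] in the lab for h -> two massless partons.

    theta_star, phi_star: decay angles in the Higgs rest frame, measured from
    the boost axis. p0 is taken to be the Higgs energy (an assumption).
    """
    gamma = p0 / m_h
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    e_star = 0.5 * m_h  # rest-frame energy of each massless daughter
    daughters = []
    for sign, phi in ((1.0, phi_star), (-1.0, phi_star + math.pi)):
        cos_t = sign * math.cos(theta_star)
        energy = gamma * e_star * (1.0 + beta * cos_t)
        p_long = gamma * e_star * (cos_t + beta)   # component along the boost axis
        p_perp = e_star * math.sin(theta_star)     # unchanged by the boost
        daughters.append((energy, math.atan2(p_perp, p_long), phi % (2.0 * math.pi)))
    return daughters

def within_cone(template, cone_r=0.4):
    """True if every parton of the template lies inside the jet cone."""
    return all(theta <= cone_r for _, theta, _ in template)
```

For a symmetric rest-frame decay (`theta_star` equal to $\pi/2$) each daughter carries half of `p0` and sits at a lab angle of about $m_H/P_0 \approx 0.12$ from the boost axis, well inside a cone of $R = 0.4$.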

Once again, we generate a large set of template states, so that we are confident of identifying
the peak value of overlap. We discretize the angles with a small length of $2\pi/120$.^3 We
can now encode the two physical angles in terms of row and column numbers, corresponding
to the data discretization scheme. Each template consists of the information $(\mathrm{row}_a, \mathrm{column}_a,
E_a)$ for each of the two daughter particles. We exclude those templates having polar angles
larger than the cone size $R$. Also, if desirable, we impose an energy cut on the template,
i.e., removing templates with less than five percent of the total energy.

^3 In fact, one can make it as small as one desires, since we will choose only one of them, corresponding to
the "peak" template.
4.1.3 Two-particle template overlap

We are now ready to implement Eq. (5) for the Higgs, by defining an overlap between
templates, $|f\rangle$, and jet states, $|j\rangle$: $Ov = \langle j|f\rangle$. Defined as above, our templates each have
two cells corresponding to two daughter partons ($q$ and $\bar q$) with their row and column numbers
determined by the data discretization scheme.

As for the top, we compute the overlap between data state $j$ and template $f$ from an
unweighted sum of all the energy in the nine cells of state $j$ surrounding and including
each of the two occupied cells of template state $f$. In summary, we define the overlap of a template
function with the energy distribution of the data to be



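$$Ov(j, f) = \max_{\tau_n^{(R)}} \exp\left[-\sum_{a=1}^{2} \frac{1}{2\sigma_a^2}\left(\sum_{k=i_a-1}^{i_a+1}\ \sum_{l=j_a-1}^{j_a+1} E(k, l) - E(i_a, j_a)^{(f)}\right)^2\right], \tag{13}$$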



where $E(i_a, j_a)^{(f)}$ is the energy in the template state for particle $a$. If one of the sums extends
outside the jet cone, we set the corresponding energies $E(k, l)$ to zero. Again, we fix $\sigma_a$ (for
the $a$th parton) by that parton's energy, $\sigma_a = E(i_a, j_a)^{(f)}/2$, as in Eq. (7).
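To get a feeling for the numbers, it may help to rewrite the exponent in terms of fractional energy mismatches. The short derivation below is only a worked illustration; it assumes that the width is half the template parton energy, and the symbol $\delta_a$ is introduced here for convenience.

```latex
% Assumes \sigma_a = E(i_a, j_a)^{(f)}/2; \delta_a is the fractional energy
% mismatch between the nine-cell sum and template parton a.
\delta_a \equiv \frac{\sum_{k,l} E(k,l) - E(i_a,j_a)^{(f)}}{E(i_a,j_a)^{(f)}},
\qquad
\frac{1}{2\sigma_a^2}\Big(\sum_{k,l} E(k,l) - E(i_a,j_a)^{(f)}\Big)^2 = 2\,\delta_a^2,
\qquad
\exp\Big[-2\sum_a \delta_a^2\Big]
\;\xrightarrow{\ \delta_1 = 0.1,\ \delta_2 = 0.2\ }\;
e^{-0.1} \approx 0.905 .
```

In this form the Gaussian depends only on relative mismatches, so for a given template a 10% and a 20% deviation in the two parton energies still leave the exponential at about 0.9.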

In Fig. 5 we use the overlap, Eq. (13), to validate the template function when compared
to MC output events at partonic level, showing that each peak value is close to unity for all
the events in our sample of Higgs decaying into $b\bar b$. The points cluster even closer to unity
for the Higgs than for the top, Fig. 2, because we have used a finer discretization for the
(simpler) Higgs templates.



[Plot: $\Delta R$ versus $Ov$ ($m_J = m_H$, $P_0 = 1000$ GeV).]



Figure 5: A scatter plot of template overlap, Eq. (13), and the angular distance $\Delta R$ between
the two partons coming from the Higgs decay, with $P_0 = 1$ TeV, $m_H = 120$ GeV.

Figure 6: Histogram of template overlap distributions (Eq. (13)), $Ov(j, f)$, with MC-generated
Higgs jets and QCD jets, where $R = 0.4$, 950 GeV $< P_0 <$ 1050 GeV, 110 GeV $< m_J <$ 130 GeV
and $m_H = 120$ GeV. (MG/ME [37] with MLM matching [38].)



4.2 Enhancing overlap with $\theta_s$ and angularities

We now apply the peak template overlap method to analyze energetic Higgs jet events
vis-à-vis QCD jets. We use the data for QCD jet and hadronic Higgs jet events (after showering
and hadronization), for $R = 0.4$, 950 GeV $< P_0 <$ 1050 GeV, 110 GeV $< m_J <$ 130 GeV and
$m_H = 120$ GeV, as obtained from MG/ME [37] (with MLM matching [38]) via the anti-$k_T$ jet
clustering algorithm [31].

Our aim is to understand how well we can discriminate our signal from the QCD background
using the simplest two-particle templates. In Fig. 6 we compare the template overlap
$Ov(j, f)$ distributions from Eq. (13) for Higgs and QCD jets. We see that Higgs jet events
are peaked toward larger values of template overlap, $Ov$, than QCD jets. We can therefore
use a large $Ov$ value as a quality cut, say $Ov > 0.85$, to ensure that the events under
consideration are two-prong-like in terms of their energy flow. Furthermore, even within the
two-body description, we have seen that Higgs events tend to be peaked towards smaller $\theta_s$
than the QCD jets. We hence expect to improve rejection power from an appropriate cut
on $\theta_s$.

To the extent that the energy flow of the jets is similar to that of a two-body decay, their
kinematics is determined by a single continuous variable, of which $\theta_s$ is only one example.
Indeed, we can use properties of the data itself as alternatives to $\theta_s$. A set of such alternatives
is given by the class of angularities, classified by a parameter $a$ and defined by [T71B3]




(14) 
where $\omega_i$ is the energy of a component inside the jet (such as a calorimeter tower). Limiting
the parameter $a < 2$ ensures IR safety, as can be seen from the second expression on the
right-hand side of the equation, which is valid for small-angle radiation, $\theta_i \ll 1$.
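For orientation, the following sketch evaluates an angularity for a list of jet components. It assumes one common convention from the angularity literature, $\tau_a = \frac{1}{m_J}\sum_i \omega_i \sin^a\theta_i\,(1-\cos\theta_i)^{1-a}$, with small-angle limit $\frac{2^{a-1}}{m_J}\sum_i \omega_i\,\theta_i^{2-a}$; the normalization of the angles used in Eq. (14) may differ from this assumption. Function names are hypothetical, and the angles must be nonzero when $a$ is negative.

```python
import math


def angularity(omegas, thetas, a, m_jet):
    """tau_a = (1/m_J) * sum_i omega_i * sin(theta_i)^a * (1 - cos(theta_i))^(1 - a)."""
    return sum(w * math.sin(t) ** a * (1.0 - math.cos(t)) ** (1.0 - a)
               for w, t in zip(omegas, thetas)) / m_jet


def angularity_small_angle(omegas, thetas, a, m_jet):
    """Small-angle limit: tau_a ~ (2^(a-1)/m_J) * sum_i omega_i * theta_i^(2 - a)."""
    return 2.0 ** (a - 1.0) * sum(w * t ** (2.0 - a)
                                  for w, t in zip(omegas, thetas)) / m_jet


# Toy two-component jet (illustrative numbers only).
omegas, thetas = [600.0, 400.0], [0.05, 0.08]
exact = angularity(omegas, thetas, a=-2.0, m_jet=120.0)
approx = angularity_small_angle(omegas, thetas, a=-2.0, m_jet=120.0)
```

For angles of this size the two expressions agree to better than one part in a thousand. Since the summand scales as $\theta_i^{2-a}$, components at vanishing angle do not contribute for $a < 2$.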

Angularities, $\tau_a$, distinguish between Higgs and QCD jets in much the same way as
the template angle $\theta_s$, as can be seen by examining the jet differential distributions in $\tau_a$,
analogous to Eqs. (11) and (12) for $\theta_s$. In particular, a simple approximation can be obtained
when $z = m_J/P_0 \ll 1$ and $a$ is negative with $|a| = O(1)$,

$$\frac{dJ^{h}}{d\tau_a} \propto \frac{1}{|a|\,\tau_a^{\,1-\frac{2}{a}}} , \qquad (15)$$

rather strongly peaked for small $\tau_a$. As suggested above, these decays are "democratic",
sharing the energy of the Higgs rather evenly between the pair. The distribution for lowest-order
QCD events is still peaked at small $\tau_a$, but less so,

$$\frac{dJ^{\rm QCD}}{d\tau_a} \propto \frac{1}{|a|\,\tau_a} . \qquad (16)$$

We may thus expect that cuts of the data based on angularities will give results qualitatively
similar to those based on $\theta_s$. On the other hand, $\theta_s$, which is a parameter for two-body
template states, already provides useful information on physical states, as well as a clear
picture of their energy flow.
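The different small-$\tau_a$ behaviour of signal and background can be motivated by a short two-body estimate. The lines below are a sketch under simplifying assumptions of our own choosing: massless daughters, small angles, a softer daughter carrying energy fraction $x \ll 1$ at opening angle $\theta$ from the harder one, a flat distribution in $x$ for the isotropic Higgs decay, and a $1/x$ soft enhancement for the lowest-order QCD splitting. The symbols $x$ and $\theta$ are introduced only for this illustration.

```latex
% Sketch only: x, \theta and all constants of proportionality are illustrative.
% For a < 0 the harder daughter's contribution to \tau_a is suppressed by x^{1-a}.
\begin{align}
m_J^2 &\simeq x(1-x)\,P_0^2\,\theta^2
  \;\;\Rightarrow\;\; \theta \simeq \frac{z}{\sqrt{x}} \qquad (x \ll 1), \\
\tau_a &\simeq \frac{2^{a-1}}{m_J}\, x P_0\, \theta^{2-a}
  \simeq 2^{a-1} z^{1-a}\, x^{a/2}
  \;\;\Rightarrow\;\;
  \left|\frac{dx}{d\tau_a}\right| \propto \frac{1}{|a|}\,\tau_a^{\frac{2}{a}-1}, \\
\frac{dJ^{h}}{dx} &\propto 1
  \;\;\Rightarrow\;\; \frac{dJ^{h}}{d\tau_a} \propto \frac{1}{|a|\,\tau_a^{1-2/a}},
  \qquad
\frac{dJ^{\rm QCD}}{dx} \propto \frac{1}{x}
  \;\;\Rightarrow\;\; \frac{dJ^{\rm QCD}}{d\tau_a} \propto \frac{1}{|a|\,\tau_a}.
\end{align}
```

Since $a$ is negative, the exponent $1 - 2/a$ exceeds one, so in this estimate the Higgs distribution falls more steeply with $\tau_a$ than the QCD one.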

We now analyze the effects of limiting the data to small $\theta_s$ or small angularity. In the plot
on the left of Fig. 7 we compare the $\theta_s$ distributions for Higgs and QCD jets, with a lower cut
on template overlap, $Ov > 0.85$, which confirms our understanding from Eqs. (11) and (12).
A cut $\theta_s < 0.2$ (or a corresponding cut on angularity) clearly removes a larger proportion
of QCD jets than Higgs jets. In Fig. 8 we show efficiency vs. fake rates with various cuts
on template overlap $Ov$. The curves correspond to a variation of the maximum size of the
$\theta_s$ template parameter cut. Each is a scan from the minimum value of $\theta_s$ (which is fixed by
the kinematics) to $\theta_{\max} \simeq 0.43$. For a given cut on $Ov$, the efficiency is controlled by the
$\theta_s$ cut.
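The scan behind such efficiency versus fake-rate curves can be sketched as follows. This is a minimal illustration with hypothetical function names; the per-jet values of $Ov$ and $\theta_s$ are assumed to be available as arrays, and the toy numbers below are invented purely to exercise the code.

```python
import numpy as np


def pass_fraction(ov, theta_s, ov_min, theta_max):
    """Fraction of jets with Ov > ov_min and theta_s < theta_max."""
    ov, theta_s = np.asarray(ov), np.asarray(theta_s)
    return float(np.mean((ov > ov_min) & (theta_s < theta_max)))


def scan_theta_cut(signal, background, ov_min, theta_cuts):
    """Return (theta_max, efficiency, fake rate) for each value of the theta_s cut.

    signal and background are (ov, theta_s) pairs of equal-length sequences.
    """
    return [(cut,
             pass_fraction(*signal, ov_min, cut),
             pass_fraction(*background, ov_min, cut))
            for cut in theta_cuts]


# Invented toy samples, four jets each: (Ov values, theta_s values).
signal = ([0.90, 0.95, 0.60, 0.88], [0.15, 0.25, 0.10, 0.18])
background = ([0.90, 0.30, 0.87, 0.50], [0.35, 0.10, 0.15, 0.30])
curve = scan_theta_cut(signal, background, ov_min=0.85, theta_cuts=[0.2, 0.43])
```

Each tuple in `curve` gives one point on a curve at fixed $Ov$ cut; tightening the $\theta_s$ cut lowers both the efficiency and the fake rate.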



4.3 Planar flow for the Higgs 

So far, we have analyzed Higgs jets using only template overlaps based on LO partonic
decay kinematics. In principle, the templates can be systematically improved by including
the effects of gluon emissions, which contain color flow information [MfBS]. Actually,
higher-order effects can be partly captured by using planar flow [17], which we have
already introduced and defined for the top. We expect soft radiation from the
boosted color-singlet Higgs to be concentrated between the $b$ and $\bar b$ decay products. This is
to be contrasted with a jet initiated by a light parton, whose color is correlated with particles
in other parts of phase space, producing radiation in the gaps between those particles and
the jet system. Therefore, we expect that planar flow for Higgs jets will be peaked toward a
lower value than that of QCD jets.

Figure 7: In the plot on the left (right), we show a histogram of $\theta_s$ ($\tau_{-2}$) with template
overlap $Ov > 0.85$. We choose $R = 0.4$, 950 GeV $< P_0 <$ 1050 GeV, 110 GeV $< m_J <$ 130
GeV and $m_H = 120$ GeV. (MG/ME [37] with MLM matching [38].)

[Plot: efficiency versus fake rate [%].]

Figure 8: Fake rate vs. efficiency with various cuts on template overlap $Ov$ and $\theta_s$, for
$R = 0.4$, 950 GeV $< P_0 <$ 1050 GeV, 110 GeV $< m_J <$ 130 GeV and $m_H = 120$ GeV. The
curves are the result of varying the maximal value of $\theta_s$. Both efficiency and fake rates
decrease as we lower the cut on $\theta_s$. (MG/ME [37] with MLM matching [38].)



In Fig. 9 on the upper left, we show the Pf distributions for QCD jet and Higgs jet
events. This panel of the figure confirms our expectation that Higgs jets tend to have smaller
Pf values than QCD jet events (for the same $z = m_J/P_0$). In the remaining panels, we
show scatter plots of Pf vs. template overlap $Ov$, which show that both QCD and Higgs jets
reflect two-pronged energy flow. In both cases, those events with large values of $Ov$ tend to
have relatively small values of Pf. We see, however, that the Higgs events yield somewhat
smaller Pf, with a concentration of points at larger $Ov$ in general, again in agreement with
our heuristic expectations.
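As a reminder of what planar flow measures, the sketch below evaluates it for a list of jet constituents. It assumes the commonly used form $Pf = 4\det(I_w)/\mathrm{tr}(I_w)^2$ with $I_w^{kl} = \frac{1}{m_J}\sum_i p_{i,k}\,p_{i,l}/w_i$, where $w_i$ is the energy of constituent $i$ and $p_{i,k}$ are its momentum components transverse to the jet axis; the function name and the toy inputs are hypothetical.

```python
import numpy as np


def planar_flow(energies, pt_vectors, m_jet):
    """Pf = 4 det(I) / tr(I)^2 with I^{kl} = (1/m_J) sum_i p_i^k p_i^l / w_i.

    energies   : energies w_i of the jet constituents.
    pt_vectors : (n, 2) array of momentum components transverse to the jet axis.
    """
    w = np.asarray(energies, dtype=float)
    p = np.asarray(pt_vectors, dtype=float)
    inertia = (p.T / w) @ p / m_jet          # 2x2 matrix I_w
    return 4.0 * np.linalg.det(inertia) / np.trace(inertia) ** 2


# Two back-to-back prongs (linear energy flow) versus four prongs on a ring.
two_prong = planar_flow([500.0, 500.0], [[60.0, 0.0], [-60.0, 0.0]], 120.0)
ring = planar_flow([250.0] * 4,
                   [[30.0, 0.0], [0.0, 30.0], [-30.0, 0.0], [0.0, -30.0]], 120.0)
```

A strictly two-body configuration gives vanishing planar flow, while an evenly populated ring gives unity, so additional radiation away from the line joining the two prongs pushes Pf upward.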
Figure 9: In the plot on the upper left, we show a histogram of Pf for Higgs jets and QCD
jets. In the plot on the upper right, we show a scatter plot of Pf vs. template overlap $Ov$ for
Higgs jets. The remaining plot shows a scatter plot for QCD jets. Note the concentration
of points for Higgs jets at larger values of $Ov$ compared to QCD jets. We choose $R = 0.4$,
950 GeV $< P_0 <$ 1050 GeV, 110 GeV $< m_J <$ 130 GeV and $m_H = 120$ GeV. (MG/ME [37]
with MLM matching [38].)



Finally, in Fig. 10 we show the fake rate vs. efficiency when we combine template overlap,
$\theta_s$, and planar flow. In the plot on the right, we also show that angularities and $\theta_s$ indeed
have similar rejection powers. Once we combine the fake rate and efficiency from a jet mass
cut (fake rate: 4.5%, efficiency: 79%) with template overlap, $\theta_s$, and planar flow, we find,
for example, at an efficiency of 9.3%, a fake rate of 0.084% (with $Ov > 0.5$, Pf $< 0.09$, and
$\theta_s < 0.2$).
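The numbers just quoted can be turned into signal-to-background ratios with a few lines of arithmetic. The helper name `rejection_power` is ours, and attributing the ratio of the two results to the substructure cuts assumes that the quoted 9.3% and 0.084% already include the jet mass cut.

```python
def rejection_power(efficiency, fake_rate):
    """Ratio of signal efficiency to background fake rate (same units for both)."""
    return efficiency / fake_rate


mass_cut = rejection_power(79.0, 4.5)        # jet mass window alone (percent)
combined = rejection_power(9.3, 0.084)       # mass window plus Ov, Pf and theta_s cuts
substructure_gain = combined / mass_cut      # improvement from the extra cuts

print(f"{mass_cut:.1f} {combined:.1f} {substructure_gain:.1f}")  # 17.6 110.7 6.3
```

Under that assumption, the template overlap, planar flow and $\theta_s$ cuts improve the signal-to-background ratio by roughly a factor of six beyond the mass cut, for a combined ratio of order 100.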

Once again we point out that rejection power can be expected to improve once the
template overlap is extended to take into account gluon emission. Another interesting but
speculative aspect of our method is that, in principle, we can use the LEP data on $Z$ decay
and appropriately "boost" it to match the relevant kinematic regime, to obtain an estimate
of the all-orders template from the data itself.




[Plots: efficiency versus fake rate [%], left and right panels.]



Figure 10: On the left (obtained via MG/ME [37] with MLM matching [38]), we show fake
rate vs. efficiency, with various cuts on template overlap $Ov$, while varying the value of the Pf
cut, corresponding to the change in efficiency. The dashed lines denote the case when a
$\theta_s < 0.2$ cut is implemented, while the solid lines have no $\theta_s$ cut. In the plot on the right, we
show fake rate vs. efficiency with Pf $< 0.11$ and template overlap cut $Ov > 0.1$, while varying
the value of the $\theta_s$ or angularity $\tau_{-2}$ cut, corresponding to the change in efficiency. We choose
$R = 0.4$, 950 GeV $< P_0 <$ 1050 GeV, 110 GeV $< m_J <$ 130 GeV and $m_H = 120$ GeV.



5 Summary and Conclusions 

Template overlaps are a new class of infrared safe jet observables, based on functional com- 
parison of the energy flow in data with the flow in selected sets (the templates) of partonic 
states. We have demonstrated how, even with a relatively naive construction for the func- 
tional, template overlaps can be used to enrich samples of highly boosted particle decays in 
the presence of much larger QCD backgrounds. We have illustrated the method using lowest- 
order template states for highly-boosted Higgs and top decays, compared to the outputs of 
several event generators. The method, however, relies only on the infrared safety of energy
flow; it is therefore more general than boosted particle decays and may find other applications.

Different event generators give different averages for our template overlaps, which is
not surprising since the energy distributions within the jets are expected to be sensitive to
the showering mechanism, which at present has not been tested experimentally for these 
kinematical configurations. We nevertheless find in each case excellent, although variable, 
rejection power, defined as the ratio between the signal efficiency and the background fake 
rate. For the Higgs jet case we get a rejection power of order 1:100 and for a single top jet 
of order 1:1000 (Pythia 8), 1:600 (MG/ME), 1:200 (Sherpa) when combined with a jet mass
cut, with sizable efficiencies. The fact that these rejection powers were found to be strong 
in all cases is encouraging. It also suggests that the template overlap method is robust, in 
the sense that it is not overly sensitive to the treatment of soft physics. The latter clearly 
varies between the different generators, which cannot all reproduce the coming LHC data. 
Differences may be due to treatments of multiple interactions, minimum bias and underlying
events as well as showering mechanisms. The template overlaps described above are capable
of systematic improvement by weighting according to the lowest order matrix elements (in 
different contexts, such an approach has been applied to Tevatron data [46]). We may also 
include higher order corrections in the template phase space. Other improvements may come 
from changing the functional that defines the overlap, or from more sophisticated cuts on 
the data. 